Performance optimisations for simulation calculate and parameter lookups #436

Merged
MaxGhenis merged 6 commits into master from perf/fast-cache on Mar 14, 2026

@nikhilwoodruff nikhilwoodruff commented Feb 18, 2026

Two sets of performance improvements for the simulation hot path.

1. Fast cache for repeated variable lookups — Flat dict[(variable_name, str(period)), array] at the Simulation level, checked at the very top of calculate() before the tracer and full _calculate() machinery. Only active when map_to=None and decode_enums=False (the inner-loop hot path used by formulas calling dependencies). Invalidation mirrors the existing holder cache.

2. Vectorial parameter lookup optimisation — Replaces O(N×K) numpy.select in VectorialParameterNodeAtInstant.__getitem__ with O(N) index-based selection. For enum/EnumArray keys, builds a lookup table mapping integer codes directly to child indices, skipping string conversion. For string keys, uses numpy.unique to reduce comparisons. Also caches build_from_node results on ParameterNodeAtInstant to avoid rebuilding the recarray on every access.
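
The enum branch of item 2 can be illustrated with a small NumPy sketch. This is not the code from the PR: `child_names`, `child_values`, `enum_names` and `codes` are hypothetical stand-ins for a parameter node's children and an enum-coded input array. It compares a `numpy.select` baseline with a lookup table that maps integer enum codes straight to child indices.

```python
import numpy as np

# Hypothetical parameter node with K children, one value per child.
child_names = ["CA", "NY", "TX"]
child_values = np.array([0.10, 0.20, 0.30])

# Hypothetical enum whose integer codes are positions in this list.
enum_names = ["NY", "TX", "CA"]
codes = np.array([2, 0, 0, 1, 2])  # N enum codes, one per entity

# Baseline: one boolean mask per child, so O(N*K) string comparisons.
keys = np.array(enum_names)[codes]
conditions = [keys == name for name in child_names]
baseline = np.select(conditions, list(child_values), default=np.nan)

# Lookup table: enum code -> child index (-1 where no child matches).
child_index = {name: i for i, name in enumerate(child_names)}
lut = np.array([child_index.get(name, -1) for name in enum_names])

# One integer-indexing pass over the N codes, with no string conversion.
idx = lut[codes]
if (idx < 0).any():
    raise KeyError("enum value has no matching child")
fast = child_values[idx]

assert np.array_equal(fast, baseline)
```

The table is built once per enum (its size is the number of enum members), so the per-entity work is a single array index rather than K comparisons.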

US household_net_income compute: 12.8s → 9.0s (-30%). All existing tests pass (tracer test failures are pre-existing).

@nikhilwoodruff (Contributor, Author) commented

Good catch: the fast path was skipping tracer.record_calculation_start() even when trace=True. I added `and not self.trace` to the guard, so FullTracer sees every calculation as before. No perf impact on the normal (non-trace) path.
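
A minimal sketch of the guard described above, assuming a toy class rather than the real Simulation. `ToySimulation`, `traced` and `full_calls` are hypothetical names used only to make the behaviour observable; the real method does far more work on the slow path.

```python
import numpy as np


class ToySimulation:
    """Hypothetical stand-in for Simulation; not the real class."""

    def __init__(self, trace=False):
        self.trace = trace
        self._fast_cache = {}
        self.traced = []     # stands in for what the tracer records
        self.full_calls = 0  # counts trips through the slow path

    def calculate(self, variable_name, period, map_to=None, decode_enums=False):
        key = (variable_name, str(period))
        # Fast path: plain lookups only, and never while tracing.
        if map_to is None and not decode_enums and not self.trace:
            cached = self._fast_cache.get(key)
            if cached is not None:
                return cached
        if self.trace:
            self.traced.append(key)
        value = self._calculate(variable_name, period)
        if map_to is None and not decode_enums:
            self._fast_cache[key] = value
        return value

    def _calculate(self, variable_name, period):
        self.full_calls += 1
        return np.zeros(3)  # placeholder result


sim = ToySimulation()
sim.calculate("income", 2026)
sim.calculate("income", 2026)  # served from the fast cache

traced_sim = ToySimulation(trace=True)
traced_sim.calculate("income", 2026)
traced_sim.calculate("income", 2026)  # fast path skipped, tracer sees both
```

With tracing off, the second call never reaches `_calculate`; with tracing on, both calls are recorded.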

@nikhilwoodruff nikhilwoodruff changed the title from "Add _fast_cache to Simulation for O(1) repeated variable lookups" to "Performance optimisations for simulation calculate and parameter lookups" on Mar 8, 2026
nikhilwoodruff and others added 5 commits March 14, 2026 07:17

Adds a flat dict[tuple[str,str], array] at the Simulation level, checked at the
top of calculate() before tracer, random seed and _calculate() machinery. Only
active when map_to=None and decode_enums=False (the inner-loop hot path).

Invalidation mirrors the existing holder cache:
- purge_cache_of_invalid_values() removes invalidated entries
- delete_arrays() removes the relevant key(s)
- clone() gets a fresh empty cache to prevent cross-simulation sharing

Uses getattr/hasattr guards so StubSimulation and other test subclasses that
bypass __init__ work without modification.
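
The invalidation rules and the getattr guard above might look roughly like this. It is a sketch under stated assumptions, not the PR's code: `CacheOwner` is a hypothetical class, and only the cache bookkeeping of delete_arrays() and clone() is modelled.

```python
class CacheOwner:
    """Hypothetical stand-in; only the fast-cache bookkeeping is modelled."""

    def __init__(self):
        self._fast_cache = {}

    def delete_arrays(self, variable_name, period=None):
        # getattr guard: instances created without __init__ have no cache.
        cache = getattr(self, "_fast_cache", None)
        if cache is None:
            return
        if period is None:
            # Drop every period cached for this variable.
            for key in [k for k in cache if k[0] == variable_name]:
                del cache[key]
        else:
            cache.pop((variable_name, str(period)), None)

    def clone(self):
        new = self.__class__.__new__(self.__class__)
        new.__dict__.update(self.__dict__)
        new._fast_cache = {}  # fresh cache, never shared with the source
        return new


owner = CacheOwner()
owner._fast_cache[("income", "2026")] = [1.0]
owner._fast_cache[("income", "2025")] = [2.0]
owner._fast_cache[("tax", "2026")] = [3.0]

copy = owner.clone()
owner.delete_arrays("income", 2026)  # removes one key
owner.delete_arrays("tax")           # removes all periods for "tax"

stub = CacheOwner.__new__(CacheOwner)  # bypasses __init__, like a test stub
stub.delete_arrays("income")           # no AttributeError
```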

Co-Authored-By: Claude <noreply@anthropic.com>

Skip the fast path when tracing is enabled, so FullTracer records all
calculations correctly.

Co-Authored-By: Claude <noreply@anthropic.com>

Replace O(N×K) numpy.select with O(N) index-based selection in
VectorialParameterNodeAtInstant.__getitem__. For enum/EnumArray keys,
build a lookup table mapping integer codes directly to child indices,
avoiding the intermediate string conversion entirely. For string keys,
use numpy.unique to reduce N×K string comparisons to U dict lookups
(where U = unique keys, typically ≪ N).
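
The string-key branch can be sketched as follows. The names `child_names`, `child_values` and `keys` are hypothetical, and this is an illustration of the numpy.unique idea rather than the code in this commit.

```python
import numpy as np

# Hypothetical parameter node with K children, one value per child.
child_names = ["CA", "NY", "TX"]
child_values = np.array([0.10, 0.20, 0.30])
child_index = {name: i for i, name in enumerate(child_names)}

keys = np.array(["NY", "CA", "NY", "NY", "TX", "CA"])  # N string keys

# U unique keys, plus each entry's position within `unique`.
unique, inverse = np.unique(keys, return_inverse=True)

# U dict lookups instead of N*K string comparisons.
unique_idx = np.array([child_index[str(u)] for u in unique])

# Expand back to N entries with one integer-indexing pass.
result = child_values[unique_idx[inverse]]
```

Note that numpy.unique sorts its input, so this step is not strictly linear in N; the saving comes from doing the string-to-child matching only U times.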

Also cache build_from_node results on ParameterNodeAtInstant to avoid
rebuilding the recarray on every vectorial access.

US household_net_income compute: 12.8s → 9.0s (-30%).

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Use getattr for _trace in fast cache guard to handle StubSimulation
  subclasses that bypass Simulation.__init__
- Fix float() conversion for recarray elements in scalar lookup path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@MaxGhenis MaxGhenis merged commit 1a8eeb5 into master Mar 14, 2026
16 checks passed
@MaxGhenis MaxGhenis deleted the perf/fast-cache branch March 14, 2026 14:33